(54) System and method of implementing disk ownership in networked storage

(57) A system and method for disk ownership in a
network storage system. Each disk has two ownership
attributes set to show that a particular file server owns
the disk. In a preferred embodiment the first ownership
attribute is the serial number of the file server being
written to a specific location on each disk and the second
ownership attribute is setting a SCSI-3 persistent
reservation. In a system utilizing this disk ownership method,
multiple file servers can read data from a given disk, but
only the file server that owns a particular disk can write
data to the disk.



[FIG. 4: block diagram of the storage operating system 312. Labels: file system layer; HTTP 414; CIFS 410; NFS 412; TCP 406; UDP 408; IP layer 404; media access layer; disk storage layer (RAID); ownership table 422; ownership layer 420; disk driver layer (SCSI) 418; to LAN 102; to switching network 122.]






EP 1 321 848 A2 






Description 

Field of the Invention 

[0001] The present invention relates to networked file
servers, and more particularly to disk ownership in net- 
worked file servers. 

Background of the Invention 


[0002] A file server is a computer that provides file 
service relating to the organization of information on 
storage devices, such as disks. The file server or filer 
includes a storage operating system that implements a 
file system to logically organize the information as a
hierarchical structure of directories and files on the disks.
Each "on-disk" file may be implemented as a set of data
structures, e.g., disk blocks, configured to store
information. A directory, conversely, may be implemented as
a specially formatted file in which information about other
files and directories is stored.

[0003] A filer may be further configured to operate ac- 
cording to a client/server model of information delivery 
to thereby allow many clients to access files stored on 
a server. In this model, the client may comprise an
application, such as a database application, executing on
a computer that connects to the filer over a computer
network. This computer network could be a point-to-point
link, a shared local area network (LAN), a wide area
network (WAN) or a virtual private network (VPN)
implemented over a public network such as the Internet. Each
client may request the services of the file system on the 
filer by issuing file system protocol messages (typically 
in the form of packets) to the filer over the network. 

[0004] The disk storage typically implemented has
one or more storage "volumes" comprised of a collection
of physical storage disks, defining an overall logical
arrangement of storage space. Currently available filer
implementations can serve a large number of discrete
volumes (150 or more, for example). Each volume is
generally associated with its own file system. The disks
within a volume/file system are typically organized as one
or more groups of Redundant Array of Independent (or
Inexpensive) Disks (RAID). RAID implementations
enhance the reliability and integrity of data storage through
the redundant writing of data stripes across a given
number of physical disks in the RAID group, and the
appropriate caching of parity information with respect to
the striped data. In the example of a WAFL-based file
system and process, a RAID 4 implementation is
advantageously employed. This implementation specifically
entails the striping of data across a group of disks, and
separate parity caching within a selected disk of the
RAID 4 group.
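
As a rough illustration of the RAID 4 arrangement just described (data striped across a group of disks, parity held on one selected disk), the following minimal Python sketch computes XOR parity for one stripe and rebuilds a lost block from the survivors. The function names and the use of in-memory byte strings as "disk blocks" are assumptions made for illustration only; they are not taken from the described system.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return one stripe: the data blocks followed by a single parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild(stripe, lost_index):
    """Recover the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Three data blocks; the fourth entry of the stripe is the parity block.
stripe = make_stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
recovered = rebuild(stripe, 1)  # pretend the second data disk failed
```

Because the parity block is the XOR of all data blocks, any single missing block equals the XOR of the remaining ones, which is why one failed disk in the group can be tolerated.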

[0005] Each filer is deemed to "own" the disks that
comprise the volumes serviced by that filer. This
ownership means that the filer is responsible for servicing
the data contained on those disks. Only the filer that
owns a particular disk should be able to write data to
that disk. This solo ownership helps to ensure data in- 
tegrity and coherency. In prior storage system imple- 
mentations, it is common for a filer to be connected to 
a local area network and a fibre channel loop. The fibre 
channel loop would have a plurality of disks attached 
thereto. As the filer would be the only device directly 
connected to the disks via the fibre channel loop, the 
filer owned the disks on that loop. However, a noted
disadvantage of the prior art is the lack of scalability, as
there is a limit to the number of disks that may be added
to a single fibre channel loop. This limitation prevents a 
system administrator from having backup filers connect- 
ed to the disks in the event of failure. 
[0006] In another prior storage system implementa- 
tion, two filers, which are utilized as a cluster, could be 
connected to a single disk drive through the use of the 
disk's A/B connector. The first filer would be connected 
to the A connection, while the second filer would be con- 
nected to the disk's B connection. In this implementa- 
tion, the filer connected to a disk's A connection is 
deemed to own that disk. If the disks are arrayed in a 
disk shelf, all of the disks contained within that disk shelf 
share a common connection to the A and B connections. 
Thus, a filer connected to the A connection of a disk shelf
is deemed to own all of the disks in that disk shelf. This
lack of granularity (i.e. all disks on a shelf are owned by 
a single filer) is a known disadvantage with this type of 
implementation. 

[0007] Fig. 1 is a schematic block diagram of an ex- 
emplary network environment 100. The network 100 is 
based around a local area network (LAN) 102 intercon- 
nection. However, a wide area network (WAN), virtual 
private network (VPN), or a combination of LAN, WAN 
and VPN implementations can be established. For the
purposes of this description the term LAN should be
taken broadly to include any acceptable networking
architecture. The LAN interconnects various clients based
upon personal computers 104, servers 106 and a
network cache 108. Also interconnected to the LAN may
be a switch/router 110 that provides a gateway to the 
well-known Internet 112, thereby enabling various net- 
work devices to transmit and receive Internet based in- 
formation, including e-mail, web content, and the like. 
[0008] In this implementation, an exemplary filer 114 
is connected to the LAN 102. This filer, described further
below, is a file server configured to control storage of,
and access to, data in a set of interconnected storage 
volumes. The filer is connected to a fibre channel loop 
118. A plurality of disks are also connected to this fibre 
channel loop. These disks comprise the volumes served 
by the filer. As described further below, each volume is 
typically organized to include one or more RAID groups 
of physical storage disks for increased data storage in- 
tegrity and reliability. As noted above, in one implemen- 
tation, each disk has an A/B connection. The disk's A 
connection could be connected to one fibre channel loop 
while the B connection is connected to a separate loop.
This capability can be utilized to generate redundant
data pathways to a disk.

[0009] Each of the devices attached to the LAN
includes an appropriate conventional network interface
arrangement (not shown) for communicating over the LAN
using a desired communication protocol such as the
well-known Transmission Control Protocol/Internet Protocol
(TCP/IP), User Datagram Protocol (UDP), Hypertext
Transfer Protocol (HTTP), or Simple Network
Management Protocol (SNMP).

[0010] One prior implementation of a storage system
involves the use of switch zoning. Instead of the filer
being directly connected to the fibre channel loop, the filer
would be connected to a fibre channel switch, which
would then be connected to a plurality of fibre channel
loops. Switch zoning is accomplished within the fibre
channel switches by manually associating ports of the
switch. This association with, and among, the ports
would allow a filer connected to a port associated with
a port connected to a fibre channel loop containing disks
to "see" the disks within that loop. That is, the disks are
visible to that port. However, a disadvantage of the
switch zoning methodology was that a filer could only
see what was within its zone. A zone is defined as all
devices that are connected to ports associated with the
port to which the filer was connected. Another noted
disadvantage of this switch zoning method is that if zoning
needs to be modified, an interruption of service occurs
as the switches must be taken off-line to modify zoning.
Any device attached to one particular zone can only be
owned by another device within that zone. It is possible
to have multiple filers within a single zone; however,
ownership issues then arise as to the disks within that
zone.

[0011] The need, thus, arises for a technique for a filer
to determine which disks it owns other than through a
hardware mechanism and zoning contained within a
switch. Such a disk ownership methodology for networked
storage would permit easier scalability of
networked storage solutions.

Summary of the Invention 

[0012] One aspect of the invention overcomes the
disadvantages of the prior art by providing a system and
method of implementing disk ownership by respective
file servers without the need for direct physical
connection or switch zoning within fibre channel (or other)
switches. A two-part ownership identification system
and method is defined. The first part of this ownership
method is the writing of ownership information to a
predetermined area of each disk. Within the system, this
ownership information acts as the definitive ownership
attribute. The second part of the ownership method is
the setting of a SCSI-3 persistent reservation to allow
only the disk owner to write to the disk. This use of a
SCSI-3 persistent reservation allows other filers to read
the ownership information from the disks. It should be
noted that other forms of persistent reservations can be
used in accordance with the invention. For example, if
a SCSI level 4 command set is generated that includes
persistent reservations operating like those contained
within the SCSI-3 command, these new reservations
are expressly contemplated to be used in accordance
with the invention.

[0013] By utilizing this ownership system and method, 
any number of file servers connected to a switching
network can read from, but not write to, all of the disks
connected to the switching network. In general, this novel
ownership system and method enables any number of 
file servers to be connected to one or more switches or- 
ganized as a switching fabric with each file server being 
able to read data from all of the disks connected to the 
switching fabric. Only the file server that presently owns
a particular disk can write to that disk.

Brief Description of the Drawings

[0014] The above and further advantages of the in- 
vention may be better understood by referring to the fol- 
lowing description in conjunction with the accompanying 
drawings in which like reference numerals indicate iden- 
tical or functionally similar elements: 

Fig. 1, already described, is a schematic block
diagram of a network environment showing the prior
art of a filer directly connected to a fibre channel loop;

Fig. 2 is a schematic block diagram of a network 
environment including various network devices in- 
cluding exemplary file servers and associated vol- 
umes; 

Fig. 3 is a schematic block diagram of an exemplary 
storage appliance in accordance with Fig. 2; 

Fig. 4 is a schematic block diagram of a storage op- 
erating system for use with the exemplary file server 
of Fig. 3 according to an embodiment of this inven- 
tion; 

Fig. 5 is a block diagram of an ownership table 
maintained by the ownership layer of the storage 
operating system of Fig. 4 in accordance with an 
embodiment of this invention; and 

Fig. 6 is a flow chart detailing the steps performed 
by the storage operating system upon boot up to 
obtain ownership information of all disks connected 
to fibre channel switches connected to the
individual filer.

Detailed Description of an Illustrative Embodiment

A. Network Environment 

[0015] Fig. 2 is a schematic block diagram of an
exemplary network environment 200 in which the
principles of the present invention are implemented. This
network is based around a LAN 102 and includes a plurality
of clients such as a network cache 108, personal
computers 104, servers 106, and a switch/router 110 for
connection to the well-known Internet.

[0016] Exemplary file servers, filers A and B, are also
connected to the LAN. Filers A and B are also connected
to a switch S1. The switch S1 is preferably a fibre
channel switch containing a plurality of ports P1, P2, P3, P4
and P5. One example of a fibre channel switch is the
Silkworm 6400™ available from Brocade
Communications Systems, Inc. of San Jose, CA. It should be noted
that it is expressly contemplated that other forms of 
switches may be utilized in accordance with the present 
invention. 

[0017] Attached to the various ports of switch S1
are fibre channel loops L1 and L2 and a second switch
S2. Attached to a port P7 of switch S2 is a third fibre 
channel loop L3. Each of the fibre channel loops has a 
plurality of disks attached thereto. In an illustrative con- 
figuration, ports P3 and P6 can also be linked to enable 
switches to communicate as if they are part of a single 
switching fabric. It should be noted that each port of a 
switch is assumed to be identical. As such, fibre channel 
loops, filers or other switches can be connected to any 
port. The port numbers given here are for illustrative pur- 
poses only. 

[0018] It is preferred to have only one filer own an in- 
dividual disk. This singular ownership prevents conflict- 
ing data writes and helps to ensure data integrity. Switch 
zoning permits individual ports of a switch to be associ- 
ated into a zone. As an illustrative example, ports P1 
and P5 of switch S1 could be associated into a single 
zone. Similarly, ports P2 and P4 could be zoned togeth- 
er. This association is made within the individual switch 
using appropriate switch control hardware and software. 
This switch zoning creates, in effect, a "hard" partition 
between individual zones. Note also that the number of 
switches and ports and their configuration is highly var- 
iable. A device attached to a switch can only see and 
access other devices within the same zone. Changing
the zoning, for example to move the fibre channel loop
attached to port P4 from one zone to another, typically
requires taking the entire file server off-line for a period of
time.
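
The "hard" partition created by switch zoning can be pictured with a small Python sketch. It models each zone as a set of port names and returns the devices a filer can see from its own port. The zone membership (P1 with P5, P2 with P4) follows the example above; which loops and disks hang off P4 and P5, the disk names, and the function name are hypothetical and chosen only for illustration.

```python
def visible_devices(filer_port, zones, attached):
    """Return the devices a filer can "see": those on ports zoned with its port."""
    seen = set()
    for zone in zones:
        if filer_port in zone:
            for port in zone - {filer_port}:
                seen.update(attached.get(port, ()))
    return seen

# Zones as in the example: P1 with P5, and P2 with P4.
zones = [{"P1", "P5"}, {"P2", "P4"}]
# Hypothetical attachment of disks (via loops) to switch ports.
attached = {"P4": ["disk-a", "disk-b"], "P5": ["disk-c"]}

filer_a_view = visible_devices("P1", zones, attached)  # filer assumed on port P1
filer_b_view = visible_devices("P2", zones, attached)  # filer assumed on port P2
```

In this model neither filer can see the other's disks at all, which is the limitation the ownership scheme described next is meant to remove.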

[0019] To overcome the disadvantages of the prior art,
ownership information is written to each physical disk.
This ownership information permits multiple filers and
fibre channel loops to be interconnected, with each filer
being able to see all disks connected to the switching
network. By "see" it is meant that the filer can recognize
the disks present and can read data from the disks. Any
filer is then able to read data from any disk, but only the
filer that owns a disk may write data to it. This ownership
information consists of two ownership attributes. The
first attribute is ownership information written to a
predetermined area of each disk. This predetermined area
is called sector S. This sector S can be any known and
constant location on each of the disks. In one
embodiment, sector S is sector zero of each of the disks.

[0020] The second attribute is Small Computer System
Interface (SCSI) level 3 persistent reservations.
These SCSI-3 reservations are described in SCSI Primary
Commands - 3, by Committee T10 of the National
Committee for Information Technology Standards,
which is incorporated fully herein by reference. By using
SCSI-3 reservations, non-owning file servers are
prevented from writing to a disk; however, the non-owning
file servers can still read the ownership information from
a predetermined location on the disk. In a preferred
embodiment, the ownership information stored in sector
S acts as the definitive ownership data. In this preferred
embodiment, if the SCSI-3 reservations do not match
the sector S data, the sector S ownership is used.
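
The interplay of the two ownership attributes can be sketched in Python as follows. This is a toy in-memory model, not a SCSI implementation: the `Disk` class, its field names, and the serial-number strings are assumptions for illustration, and the `reservation_owner` field merely stands in for a SCSI-3 persistent reservation.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Disk:
    sector_s_owner: Optional[str] = None     # serial number recorded in sector S
    reservation_owner: Optional[str] = None  # stand-in for the persistent reservation
    blocks: dict = field(default_factory=dict)

    def claim(self, serial):
        """Set both ownership attributes for the filer with this serial number."""
        self.sector_s_owner = serial
        self.reservation_owner = serial

    def read(self, serial, block):
        """Any filer may read, whether or not it owns the disk."""
        return self.blocks.get(block)

    def write(self, serial, block, data):
        """Only the filer holding the reservation may write."""
        if serial != self.reservation_owner:
            raise PermissionError(f"{serial} does not hold the reservation")
        self.blocks[block] = data

    def definitive_owner(self):
        """Sector S is treated as definitive if the two attributes disagree."""
        return self.sector_s_owner

disk = Disk()
disk.claim("SN-A")
disk.write("SN-A", 7, b"payload")
```

With this model, a call such as `disk.read("SN-B", 7)` succeeds for a non-owning filer, while `disk.write("SN-B", 7, ...)` raises `PermissionError`.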

B. File Servers 


[0021] Fig. 3 is a more-detailed schematic block
diagram of illustrative Filer A that is advantageously used
with this invention. Other filers can have similar
construction, including, for example, Filer B. By way of
background, a file server, embodied by a filer, is a computer
that provides file service relating to the organization of
information on storage devices, such as disks. In
addition, it will be understood by those skilled in the art that
the inventive technique described herein may apply to
any type of special-purpose computer (e.g., server) or
general-purpose computer, including a standalone
computer, embodied as a file server. Moreover, the
teachings of this invention can be adapted to a variety of file
server architectures including, but not limited to, a
network-attached storage environment, a storage area
network and disk assembly directly attached to a client/host
computer. The term "file server" should therefore
be taken broadly to include such arrangements.

[0022] The file server comprises a processor 302, a
memory 304, a network adapter 306 and a storage
adapter 308 interconnected by a system bus 310. The
file server also includes a storage operating system 312
that implements a file system to logically organize the
information as a hierarchical structure of directories and
files on the disk. Additionally, a non-volatile RAM
(NVRAM) 318 is also connected to the system bus. The
NVRAM is used for various filer backup functions
according to this embodiment. In addition, within the
NVRAM is contained a unique serial number 320. This
serial number 320 is preferably generated during the
manufacturing of the file server; however, it is
contemplated that other forms of generating the serial number
may be used, including, but not limited to, using a general
purpose computer's microprocessor identification
number, the file server's media access control (MAC)
address, etc.

[0023] In the illustrative embodiment, the memory 304 
may have storage locations that are addressable by the 
processor for storing software program code or data 
structures associated with the present invention. The 
processor and adapters may, in turn, comprise process- 
ing elements and/or logic circuitry configured to execute 
the software code and manipulate the data structures. 
The storage operating system 312, portions of which are
typically resident in memory and executed by the
processing elements, functionally organizes a file server
by, inter alia, invoking storage operations in support of a
file service implemented by the file server. It will be
apparent to those skilled in the art that other processing
and memory implementations, including various com- 
puter readable media may be used for storing and exe- 
cuting program instructions pertaining to the inventive 
technique described herein. 

[0024] The network adapter 306 comprises the me- 
chanical, electrical and signaling circuitry needed to 
connect the file server to a client over the computer net- 
work, which as described generally above, can com- 
prise a point-to-point connection or a shared medium 
such as a LAN. A client can be a general-purpose com- 
puter configured to execute applications including file 
system protocols, such as the Network File System 
(NFS) or the Common Internet File System (CIFS) pro- 
tocol. Moreover, the client can interact with the file serv- 
er in accordance with the client/server model of infor- 
mation delivery. The storage adapter cooperates with 
the storage operating system 312 executing in the file
server to access information requested by the client. 
The information may be stored in a number of storage 
volumes (Volume 0 and Volume 1) each constructed 
from an array of physical disks that are organized as 
RAID groups (RAID GROUPS 1, 2 and 3). The RAID 
groups include independent physical disks including 
those storing striped data and those storing separate
parity data. In accordance with a preferred embodiment,
RAID 4 is used. However, other configurations (e.g., 
RAID 5) are also contemplated. 

[0025] The storage adapter 308 includes input/output 
interface circuitry that couples to the disks over an I/O 
interconnect arrangement such as a conventional high- 
speed/high-performance fibre channel serial link topol- 
ogy. The information is retrieved by the storage adapter, 
and if necessary, processed by the processor (or the 
adapter itself) prior to being forwarded over the system 
bus to the network adapter, where the information is for- 
matted into a packet and returned to the client. 
[0026] To facilitate access to the disks, the storage op- 
erating system implements a file system that logically 
organizes the information as a hierarchical structure of 
directories and files on the disks. Each on-disk file may be
implemented as a set of disk blocks configured to store 
information such as text, whereas the directory may be
implemented as a specially formatted file in which other
files and directories are stored. In the illustrative
embodiment described herein, the storage operating system
associated with each volume is preferably the NetApp®
Data ONTAP storage operating system available from
Network Appliance Inc. of Sunnyvale, California that
implements a Write Anywhere File Layout (WAFL) file
system. The preferred storage operating system for the
exemplary file server is now described briefly. However, it
is expressly contemplated that the principles of this
invention can be implemented using a variety of alternate
storage operating system architectures.

[0027] The host adapter 316, which is connected to
the storage adapter of the file server, provides the file
server with a unique world wide name, described further
below.

C. Storage Operating System and Disk Ownership 

[0028] As shown in Fig. 4, the storage operating
system 312 comprises a series of software layers including
a media access layer 402 of network drivers (e.g., an
Ethernet driver). The storage operating system further
includes network protocol layers such as the Internet
Protocol (IP) layer 404 and its Transmission Control
Protocol (TCP) layer 406 and a User Datagram Protocol
(UDP) layer 408. A file system protocol layer provides
multi-protocol data access and, to that end, includes
support for the CIFS protocol 410, the Network File
System (NFS) protocol 412 and the Hypertext Transfer
Protocol (HTTP) protocol 414.

[0029] In addition, the storage operating system 312
includes a disk storage layer 416 that implements a disk
storage protocol such as a RAID protocol, and a disk
driver layer 418 that implements a disk access protocol
such as, e.g., a Small Computer System Interface (SCSI)
protocol. Included within the disk storage layer 416 is a
disk ownership layer 420, which manages the
ownership of the disks to their related volumes. Notably, the
disk ownership layer includes program instructions for
writing the proper ownership information to sector S and
to the SCSI reservation tags.

[0030] As used herein, the term "storage operating
system" generally refers to the computer-executable
code operable on a storage system that implements file
system semantics (such as the above-referenced
WAFL) and manages data access. In this sense, ONTAP
software is an example of such a storage operating
system implemented as a microkernel. The storage
operating system can also be implemented as an application
program operating over a general-purpose operating
system, such as UNIX® or Windows NT®, or as a
general-purpose operating system with configurable
functionality, which is configured for storage
applications as described herein.

[0031] Bridging the disk software layers with the
network and file system protocol layers is a file system
layer 424 of the storage operating system. Generally, the
file system layer 424 implements the file system having
an on-disk file format representation that is block
based. The file system generates operations to
load/retrieve the requested data of volumes if it is not
resident "in core," i.e., in the file server's memory. If the
information is not in memory, the file system layer indexes
into the inode file using the inode number to access an
appropriate entry and retrieve a logical block number. The file
system layer then passes the logical volume block
number to the disk storage/RAID layer, which maps that
logical number to a disk block number and sends the
latter to an appropriate driver of a disk driver layer. The
disk driver accesses the disk block number from
volumes and loads the requested data into memory for
processing by the file server. Upon completion of the
request, the file server and storage operating system
return a reply, e.g., a conventional acknowledgement
packet defined by the CIFS specification, to the client
over the network. It should be noted that the software
"path" 418 through the storage operating system layers
described above needed to perform data storage
access for the client request received at the file server may
ultimately be implemented in hardware, software or a
combination of hardware and software (firmware, for example).
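
The read path described above (file system layer, then disk storage/RAID layer, then disk driver layer) can be sketched as a chain of lookups. Everything in this Python sketch is a hypothetical stand-in: the inode table contents, the round-robin mapping from logical block to disk block, and the fake driver are assumptions for illustration, since the actual mapping is not specified here.

```python
# Hypothetical in-memory stand-ins for the layers; not the actual implementation.
INODE_FILE = {12: [40, 41]}      # inode number -> logical volume block numbers
DATA_DISKS = ["d0", "d1", "d2"]  # data disks of one RAID group (parity disk omitted)
CACHE = {}                       # blocks already resident "in core"

def raid_map(logical_block):
    """Map a logical block to (disk, disk block); round-robin is an assumption."""
    return DATA_DISKS[logical_block % len(DATA_DISKS)], logical_block // len(DATA_DISKS)

def disk_driver_read(disk, disk_block):
    """Placeholder for the disk driver layer; returns fake block contents."""
    return f"{disk}:{disk_block}".encode()

def read_block(inode_number, index):
    """Resolve inode -> logical block -> disk block, loading from disk on a miss."""
    logical = INODE_FILE[inode_number][index]
    if logical not in CACHE:  # not in memory, so go through the RAID and driver layers
        CACHE[logical] = disk_driver_read(*raid_map(logical))
    return CACHE[logical]

first = read_block(12, 0)
second = read_block(12, 1)
```

A second call for the same block is served from `CACHE` without touching the driver, mirroring the "in core" check in the text.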
[0032] Included within the ownership layer 420 is a 
disk table 422 containing disk ownership information as 
shown in Fig. 5. This disk table 422 is generated at
boot-up of the file server, and is updated by the various
components of the storage operating system to reflect
changes in ownership of disks. 

[0033] Fig. 5 is an illustrative example of the disk table 
422 maintained by the ownership layer of the storage 
operating system. The table comprises a plurality of en- 
tries 510, 520, 530 and 540, one for each disk accessi- 
ble by the subject file server. Illustrative entry 520 in- 
cludes fields for the drive identification 502, world wide 
name 504, ownership information 506 and other
information 508. The world wide name is a 64-bit
identification number which is unique for every item attached
to a fibre channel network. World wide names are
described in ANSI X3.230-1995, Fibre Channel Physical
and Signaling Interface (FC-PH) and Bob Snively, New
Identifier Formats Based on IEEE Registration
X3T11/96-467, revision 2. The world wide name is
generally inserted into disk drives during their
manufacturing process. For file servers, the world wide name is
normally generated by adding additional data bits to the file
server serial number contained within the NVRAM. 
However, it is expressly contemplated that other means 
for generating a world wide name (or other appropriate 
standardized unique naming scheme) for file servers 
are possible, including, but not limited to adding the 
manufacturer's name to a processor identification, etc. 
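
One way to picture an entry of the disk table is as a small record with the four fields named above. The Python sketch below is illustrative only: the class name, field types, sample identifiers, and the helper function are assumptions, with the reference numerals from Fig. 5 noted in comments.

```python
from dataclasses import dataclass, field

@dataclass
class DiskTableEntry:
    drive_id: str          # drive identification (502)
    world_wide_name: str   # world wide name (504)
    owner: str             # ownership information (506), e.g. a filer serial number
    other: dict = field(default_factory=dict)  # other information (508)

def disks_owned_by(table, serial):
    """Return the drive identifiers whose ownership field names this filer."""
    return [entry.drive_id for entry in table if entry.owner == serial]

# Hypothetical table contents for two filers with serial numbers SN-A and SN-B.
table = [
    DiskTableEntry("L1.d0", "wwn-0001", "SN-A"),
    DiskTableEntry("L1.d1", "wwn-0002", "SN-B"),
    DiskTableEntry("L3.d0", "wwn-0003", "SN-A"),
]
mine = disks_owned_by(table, "SN-A")
```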

[0034] Fig. 6 is a flow chart detailing the steps that the
various layers of the storage operating system of a file
server perform upon initialization to generate the initial
disk ownership table. In step 602, the I/O services and
disk driver layer queries all devices attached to the
switching network. This query requests information as
to the nature of the device attached. Upon the
completion of the query, in step 604, the ownership layer 420
(Fig. 4) instructs the disk driver layer 418 to read the
ownership information from each disk drive. The disk
driver layer reads the sector S ownership information
from each physical disk drive identified in the previous
step. The ownership layer then creates the ownership
table 422 in step 606.

[0035] The ownership layer 420 extracts from the disk
ownership table 422 the identification of all disks that
are owned by this subject file server. The ownership
layer then, in step 610, verifies the SCSI reservations on
each disk that is owned by that file server by reading the
ownership information stored in sector S. If the SCSI
reservations and sector S information do not match, the
ownership layer will, in step 614, change the SCSI
reservation to match the sector S ownership information.
Once the SCSI reservations and sector S ownership
information match for all the disks identified as being
owned by the file server, the ownership layer will then
pass the information to the disk storage layer for that
layer to configure the individual disks into the
appropriate RAID groups and volumes for the file server.
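
The initialization steps of Fig. 6 can be summarized in a short Python sketch. Devices are modeled as plain dictionaries and the reservation as a simple field; these structures, the function names, and the sample serial numbers are assumptions for illustration and do not represent real SCSI commands.

```python
def build_ownership_table(devices):
    """Steps 602-606: find the attached disks, read sector S, build the table."""
    disks = [d for d in devices if d["kind"] == "disk"]  # step 602: query devices
    return {d["id"]: d["sector_s"] for d in disks}       # steps 604 and 606

def reconcile_reservations(devices, table, my_serial):
    """Steps 610-614: on owned disks, make the reservation match sector S."""
    by_id = {d["id"]: d for d in devices}
    fixed = []
    for disk_id, owner in table.items():
        if owner != my_serial:
            continue                          # only disks this filer owns
        disk = by_id[disk_id]
        if disk["reservation"] != owner:      # step 610: mismatch detected
            disk["reservation"] = owner       # step 614: sector S is definitive
            fixed.append(disk_id)
    return fixed

devices = [
    {"id": "d0", "kind": "disk", "sector_s": "SN-A", "reservation": "SN-A"},
    {"id": "d1", "kind": "disk", "sector_s": "SN-A", "reservation": None},
    {"id": "d2", "kind": "disk", "sector_s": "SN-B", "reservation": "SN-B"},
    {"id": "s2", "kind": "switch"},
]
table = build_ownership_table(devices)
fixed = reconcile_reservations(devices, table, "SN-A")
```

After reconciliation, the owned disks could be handed to the disk storage layer for assembly into RAID groups and volumes; that step is outside this sketch.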

[0036] The disk ownership layer also provides an
application program interface (API) which is accessible by
various other layers of the storage operating system.
For example, the disk migration layer often accesses
the disk table to determine current disk
ownership. The disk migration layer is described in
co-pending European Patent Application entitled SYSTEM AND
METHOD FOR TRANSFERRING VOLUME
OWNERSHIP IN NETWORKED STORAGE by Joydeep Sen
Sarma et al. Additionally, a preselection process, which
is part of an administrative graphical user interface
(GUI), utilizes the API to access information in the disk
ownership table. This preselection process is described
in co-pending European Patent Application, titled
METHOD FOR PRESELECTING CANDIDATE DISKS
BASED ON VALIDITY FOR VOLUME by Steven
Klinkner.

[0037] Additionally, the disk ownership layer
continues to update the disk ownership table during the
operation of the file server. Thus, when the disk topology
changes, the switches involved report the changes to
connected file servers. The file servers then update their
respective disk ownership tables by executing the
method described above.

[0038] The foregoing has been a detailed description
of the invention. Various modifications and additions can
be made without departing from the scope of this
invention. Furthermore, it is expressly contemplated that the
processes shown and described according to this
invention can be implemented as software, consisting of a
computer-readable medium including program
instructions executing on a computer, as hardware or firmware
using state machines and the like, or as a combination
of hardware, software, and firmware. Since the present
invention can be implemented as a computer program,
the present invention encompasses any suitable carrier
medium carrying the computer program for input to and
execution by a computer. The carrier medium can
comprise a transient carrier medium such as a signal, e.g.
an electrical, optical, microwave, magnetic,
electromagnetic or acoustic signal, or a storage medium, e.g. a
floppy disk, hard disk, optical disk, magnetic tape, or solid
state memory device. Additionally, it is expressly
contemplated that other devices connected to a network
can have ownership of a disk in a network environment.
Accordingly, this description is meant to be taken only
by way of example and not to otherwise limit the scope
of this invention.



Claims 

1. A method for a network device to claim ownership
of a disk in a network storage system comprising
the steps of:

setting a first ownership attribute on the disk to
a state of ownership by the network device; and
setting a second ownership attribute on the disk
to a state of ownership by the network device.

2. The method of claim 1, wherein one of the first
ownership attribute and the second ownership attribute
further comprises a small computer system interface
level 3 persistent reservation tag.

3. The method of claim 1, wherein one of the first
ownership attribute and the second ownership attribute
further comprises ownership information written on
a predetermined area of the disk.

4. The method of claim 3, wherein the ownership
information further comprises a serial number of the
network device.

5. The method of any preceding claim, wherein the 
network device comprises a file server. 

6. A method of claiming ownership of a disk by a
network device in a network storage system comprising
the steps of:

writing ownership information to a predetermined
area of the disk; and
setting a small computer system interface level
3 persistent reservation tag to a state of network
device ownership.

7. The method of claim 6, wherein the ownership
information further comprises a serial number of a
network device.



8. The method of claim 6 or claim 7, wherein the
network device comprises a file server.

9. A network storage system comprising:

a plurality of network devices;
one or more switches, each network device
connected to at least one of the one or more
switches; and
a plurality of disks having a first ownership
attribute and a second ownership attribute, each
disk connected to at least one of the plurality of
switches.

10. The network storage system of claim 9, wherein the
first ownership attribute further comprises ownership
information written on a predetermined area of
the disk.

11. The network storage system of claim 9 or claim 10,
wherein the second ownership attribute further
comprises a small computer system interface level
3 persistent reservation tag.

12. The networked storage system of claim 11, wherein
each disk that is owned by the network device has
the small computer system interface level 3 persistent
reservation set such that only the network device
may write to the disk.

13. The network storage system of claim 10, wherein
the ownership information further comprises a
serial number of the network device that owns that
particular disk.

14. The network storage system of any one of claims 9
to 13, wherein each of the plurality of file servers
can read data from each of the plurality of disks.

15. The network storage system of any one of claims 9
to 14, wherein only a network device that owns one
of the plurality of disks can write data to the one disk.

16. The network storage system of any one of claims 9
to 15, wherein the network devices comprise file
servers.

17. A network storage system comprising: 

one or more switches;
a plurality of disks; and
a plurality of network devices, each of the network
devices including means for claiming
ownership of one of the plurality of disks in the
network storage system.

18. The network storage system of claim 17, wherein
the means for claiming ownership further comprises:

means for writing ownership information to a
predetermined area of a disk; and
means for setting a small computer system interface
level 3 persistent reservation on a disk.

19. The network storage system of claim 17 or claim
18, wherein the network devices comprise file servers.

20. A network storage system comprising: 

one or more switches interconnected to form a
switching fabric;
a plurality of disks, each of the disks connected
to at least one of the switches; and
one or more network devices, interconnected
with the switching fabric, each of the network
devices being adapted to own a predetermined
set of disks of the plurality of disks.

21. The network storage system of claim 20, wherein
the plurality of disks further comprises a first ownership
attribute and a second ownership attribute.

22. The network storage system of claim 21, wherein
the first ownership attribute is ownership information
written to a predetermined area of each of the
disks.

23. The network storage system of claim 22, wherein
the ownership information further comprises a serial
number of one of the one or more network devices.

24. The network storage system of any one of claims
21 to 23, wherein the second ownership information
is a small computer system interface level 3 persistent
reservation.

25. The network storage system of any one of claims
20 to 24, wherein each of the network devices further
comprises a disk ownership table, the disk
ownership table containing ownership data for each
of the disks.

26. The network storage system of claim 25, wherein
the ownership table further comprises a world wide
name for each of the disks, the world wide name
being used for identification of each of the disks.

27. A carrier medium carrying computer readable code
for controlling a computer to carry out the method
of any one of claims 1 to 8.
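
The two-attribute ownership scheme recited in claims 1, 6, 12, 14 and 15 can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the `Disk` class and the functions `claim_ownership`, `may_read`, and `may_write` are hypothetical names, no real SCSI commands are issued, and the persistent reservation is modelled simply as a field holding the serial number of the reserving device.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Disk:
    """In-memory stand-in for a disk; the two optional fields model the two ownership attributes."""
    world_wide_name: str
    sector_owner: Optional[str] = None       # serial number in the predetermined area
    reservation_owner: Optional[str] = None  # holder of the SCSI-3 persistent reservation


def claim_ownership(disk: Disk, serial_number: str) -> None:
    """Set both ownership attributes to the claiming device (claims 1 and 6)."""
    disk.sector_owner = serial_number       # first attribute: ownership information on disk
    disk.reservation_owner = serial_number  # second attribute: persistent reservation


def may_read(disk: Disk, serial_number: str) -> bool:
    """Any device may read from any disk (claim 14)."""
    return True


def may_write(disk: Disk, serial_number: str) -> bool:
    """Only the device holding the reservation may write (claims 12 and 15)."""
    return disk.reservation_owner == serial_number
```

For example, after `claim_ownership(disk, "SN-1")`, a device with serial number `"SN-2"` would still pass `may_read` but fail `may_write` in this model.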